Cloned DNA sequences related to the genomic RNA of
lymphadenopathy-associated virus (LAV) and proteins encoded by
said LAV genomic RNA

The invention relates to cloned DNA sequences
indistinguishable from genomic RNA and DNA of
lymphadenopathy-associated virus (LAV), a process for their
preparation and their uses. It relates more particularly
to stable probes including a DNA sequence which can be
used for the detection of the LAV virus or related viruses
or DNA proviruses in any medium, particularly biological
samples containing any of them. The invention also relates
to polypeptides, whether glycosylated or not, encoded by
said DNA sequences.

Lymphadenopathy-associated virus (LAV) is a human
retrovirus first isolated from the lymph node of a homosexual
patient with lymphadenopathy syndrome, frequently a
prodrome or a benign form of acquired immune deficiency
syndrome (AIDS). Subsequently other LAV isolates have been
recovered from patients with AIDS or pre-AIDS. All available
data are consistent with the virus being the causative
agent of AIDS.

A method for cloning such DNA sequences has already
been disclosed in British Patent Application Nr.
84 23659 filed on September 19, 1984. Reference is hereafter
made to that application as concerns subject matter
in common with the further improvements to the invention
disclosed herein.

The present invention aims at providing additional
new means which should not only also be useful for the
detection of LAV or related viruses (hereafter more
generally referred to as "LAV viruses"), but also have
more versatility, particularly in detecting specific parts
of the genomic DNA of said viruses whose expression products
are not always directly detectable by immunological
methods.

The present invention further aims at providing
polypeptides containing sequences in common with polypeptides
encoded by the LAV genomic RNA. It relates even more
particularly to polypeptides comprising antigenic determinants
included in the proteins encoded and expressed by
the LAV genome occurring in nature. An additional object of
the invention is to further provide means for the
detection of proteins related to LAV virus, particularly
for the diagnosis of AIDS or pre-AIDS or, to the contrary,
for the detection of antibodies against the LAV virus or
proteins related therewith, particularly in patients
afflicted with AIDS or pre-AIDS or more generally in
asymptomatic carriers and in blood-related products.

Finally the invention also aims at providing immunogenic
polypeptides, and more particularly protective
polypeptides for use in the preparation of vaccine
compositions against AIDS or related syndromes.

The present invention relates to additional DNA
fragments, hybridizable with the genomic RNA of LAV as
they will be disclosed hereafter, as well as with additional
cDNA variants corresponding to the whole genomes of
LAV viruses. It further relates to DNA recombinants
containing said DNAs or cDNA fragments.

The invention relates more particularly to a cDNA
variant corresponding to the whole of LAV retroviral
genomes, which is characterized by a series of restriction
sites in the order hereafter (from the 5' end to the 3'
end).

The coordinates of the successive sites of the
whole LAV genome (restriction map) are also indicated
hereafter, with respect to the Hind III site (selected as of
coordinate 1) which is located in the R region. The
coordinates are estimated with an accuracy of ± 200 bp:


    Hind III      0
    Sac I        50
    Hind III    520
    Pst I       800
    Hind III  1 100
    Bgl II    1 300
    Kpn I     3 500
    Kpn I     3 900
    Eco RI    4 100
    Eco RI    5 300
    Sal I     5 500
    Kpn I     6 100
    Bgl II    6 500
    Bgl II    7 600
    Hind III  7 850
    Bam HI    8 150
    Xho I     8 600
    Kpn I     8 700
    Bgl II    8 750
    Bgl II    9 150
    Sac I     9 200
    Hind III  9 250


Another DNA variant according to this invention
optionally contains an additional Hind III site at
approximately coordinate 5 550.
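
As a rough illustration of how such a map can be used, the following Python sketch computes the sizes of the fragments expected from a complete Hind III digestion. The function name `fragment_sizes` and the data layout are assumptions made for this example only. The Hind III coordinates are taken here as 0, 520, 1 100, 7 850 and 9 250, and the resulting sizes carry the same uncertainty of about 200 bp as the map itself.

```python
# Hind III coordinates (bp) as read from the restriction map in the text;
# each value is only estimated to within about +/- 200 bp.
HIND_III_SITES = [0, 520, 1100, 7850, 9250]


def fragment_sizes(sites):
    """Return the sizes of the fragments between consecutive cut sites."""
    ordered = sorted(sites)
    return [right - left for left, right in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for size in fragment_sizes(HIND_III_SITES):
        print(f"{size / 1000:.2f} kb")
```

Adding the optional site at 5 550 to the list would split the largest fragment into two pieces in the same way.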

Reference is further made to fig. 1 which shows a
more detailed restriction map of said whole DNA (λJ19).

An even more detailed nucleotidic sequence of a
preferred DNA according to the invention is shown in figs.
4-12 hereafter.

The invention further relates to other preferred 
DNA fragments which will be referred to hereafter. 

Additional features of the invention will appear
in the course of the non-limitative disclosure of additional
features of preferred DNAs of the invention, as well
as of preferred polypeptides according to the invention.
Reference will further be had to the drawings in which:

- fig. 1 is the restriction map of a complete LAV genome
(clone λJ19);

- figs. 2 and 3 show diagrammatically parts of the three
possible reading phases of LAV genomic RNA, including the
open reading frames (ORF) apparent in each of said reading
phases;

- figs. 4-12 show the successive nucleotidic sequences of
a complete LAV genome. The possible peptidic sequences in
relation to the three possible reading phases related to
the nucleotidic sequences shown are also indicated;

- figs. 13-18 reiterate the sequence of part of the LAV
genome containing the genes coding for the envelope
proteins, with particular boxed peptidic sequences which
correspond to groups which normally carry glycosyl groups.

The sequencing and determination of sites of
particular interest were carried out on a phage recombinant
corresponding to λJ19 disclosed in the abovesaid British
Patent Application Nr. 84 23659. A method for preparing it
is disclosed in that application.

The whole recombinant phage DNA of clone λJ19
(disclosed in the earlier application) was sonicated
according to the protocol of DEININGER (1983), Analytical
Biochem. 129, 216. The DNA was repaired by a Klenow
reaction for 12 hours at 16°C. The DNA was electrophoresed
through 0.8 % agarose gel and DNA in the size range of
300-600 bp was cut out, electroeluted and precipitated.
Resuspended DNA (in 10 mM Tris, pH 8; 0.1 mM EDTA) was
ligated into M13mp8 RF DNA (cut by the restriction enzyme
SmaI and subsequently alkaline phosphatased), using T4 DNA-
and RNA-ligases (Maniatis T et al (1982) - Molecular
cloning - Cold Spring Harbor Laboratory). An E. coli
strain designated as TG1 was used for further study. This
strain has the following genotype:

Δlac pro, supE, thi, F'traD36, proAB, lacIq, ZΔM15, r-

This E. coli TG1 strain has the peculiarity of
enabling recombinants to be recognized easily. The blue
colour of the cells transfected with plasmids which did
not recombine with a fragment of LAV DNA is not modified.
To the contrary, cells transfected by a recombinant plasmid
containing a LAV DNA fragment yield white colonies. The
technique which was used is disclosed in Gene (1983), 26,
101.

This strain was transformed with the ligation mix
using the Hanahan method (Hanahan D (1983) J. Mol. Biol.
166, 557). Cells were plated out on tryptone-agarose plates
with IPTG and X-gal in soft agarose. White plaques were
either picked and screened or screened directly in situ
using nitrocellulose filters. Their DNAs were hybridized
with nick-translated DNA inserts of pUC18 Hind III
subclones of λJ19. This permitted the isolation of the
plasmids or subclones of λJ19 which are identified in the
table hereafter. In relation to this table it should also
be noted that the designation of each plasmid is followed
by the deposition number of a cell culture of E. coli TG1
containing the corresponding plasmid at the "Collection
Nationale des Cultures de Micro-organismes" (C.N.C.M.) of
the Pasteur Institute in Paris, France. A non-transformed
TG1 cell line was also deposited at the C.N.C.M. under Nr.
I-36*. All these deposits took place on November 15, 1984.
The sizes of the corresponding inserts derived from the
LAV genome have also been indicated.

- pJ19-1 plasmid (I-365)     0.9 kb
      Hind III - Sac I - Hind III

- pJ19-17 plasmid (I-367)    0.6 kb
      Hind III - Pst I - Hind III

- pJ19-6 plasmid (I-366)     1.5 kb
      Hind III (5')
      Bam HI
      Xho I
      Kpn I
      Bgl II
      Sac I
      Hind III (3')

- pJ19-13 plasmid (I-366)    6.7 kb
      Hind III (5')
      Bgl II
      Kpn I
      Kpn I
      Eco RI
      Eco RI
      Sal I
      Kpn I
      Bgl II
      Bgl II
      Hind III (3')

Positively hybridizing M13 phage plaques were grown
up for 5 hours and the single-stranded DNAs were
extracted.

M13mp8 subclones of λJ19 DNAs were sequenced
according to the dideoxy method and technology devised by
Sanger et al (Sanger et al (1977), Proc. Natl. Acad. Sci.
USA, 74, 5463 and M13 cloning and sequencing handbook,
AMERSHAM (1983)). The 17-mer oligonucleotide primer,
α-35S dATP (400 Ci/mmol, AMERSHAM), and 0.5X-5X buffer
gradient gels (Biggin M.D. et al (1983), Proc. Natl. Acad.
Sci. USA, 80, 3963) were used. Gels were read and put into
the computer under the programs of Staden (Staden R.
(1982), Nucl. Acids Res. 10, 4731). All the appropriate
references and methods can be found in the AMERSHAM M13
cloning and sequencing handbook.

The complete sequence of λJ19 was deduced from the
experiments as further disclosed hereafter.

Figs. 4-12 provide the DNA nucleotidic sequence of
the complete genome of LAV. The numbering of the
nucleotides starts from a leftmost Hind III restriction
site (5' AAG...) of the restriction map. The numbering
occurs in tens, whereby the last zero of each of the
numbers occurring on the drawings is located just below the
nucleotide corresponding to the number designated,
i.e. the nucleotide at position 10 is T, the nucleotide at
position 20 is C, etc.

Above each of the lines of the successive nucleotidic
sequences there are provided three lines of single
letters corresponding to the aminoacid sequence deduced
from the DNA sequence (using the genetic code) for each of
the three reading phases, whereby said single letters have
the following meanings:

A : alanine
R : arginine
K : lysine
H : histidine
C : cysteine
M : methionine
W : tryptophan
F : phenylalanine
Y : tyrosine
L : leucine
V : valine
I : isoleucine
G : glycine
T : threonine
S : serine
E : glutamic acid
D : aspartic acid
N : asparagine
Q : glutamine
P : proline.

The asterisk signs correspond to stop codons
(i.e. TAA, TAG and TGA).

Starting above the first line of the DNA
nucleotidic sequence of fig. 4 the three reading phases
are respectively marked "1", "2", "3" on the left-hand
side of the drawing. The same relative presentation of
the three theoretical reading phases is then used all over
the successive lines of the LAV nucleotidic sequence.

Figs. 2 and 3 provide a diagrammatized representation
of the lengths of the successive open reading
frames corresponding to the successive reading phases
(also referred to by numbers "1", "2" and "3" appearing in
the left-hand side part of fig. 2). The relative positions
of these open reading frames (ORF) with respect to the
nucleotidic structure of the LAV genome are referred to by
the scale of numbers representative of the respective
positions of the corresponding nucleotides in the DNA
sequence. The vertical bars correspond to the positions of
the corresponding stop codons.

1) The "gag gene" (or ORF-gag)

The "gag gene" codes for core proteins.
Particularly it appears that a genomic fragment (ORF-gag)
thought to code for the core antigens including the p25,
p18 and p13 proteins is located between nucleotidic
position 236 (starting with 5' CTA GCG GAG 3') and
nucleotidic position 1759 (ending by CTCG TCA CAA 3'). The
structure of the peptides or proteins encoded by parts of
said ORF is deemed to be that corresponding to phase 2.

The methionine aminoacid "M" coded by the ATG at
position 260-262 is the probable initiation methionine of
the gag protein precursor. The end of ORF-gag and
accordingly of gag protein appears to be located at
position 1759.
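
The reading of an open reading frame from an initiation codon up to the first stop codon can be sketched in Python as follows. This is only an illustrative sketch: the function name `translate_from`, the 1-based position convention and the short test sequence are assumptions made for the example, and the test sequence is made up rather than copied from the LAV genome. The codon table is the standard genetic code, with stop codons (TAA, TAG, TGA) shown as asterisks.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from(seq, start):
    """Translate seq from the 1-based position `start` to the first stop codon.

    Returns the one-letter aminoacid string, without the stop sign.
    Only the letters A, C, G and T are handled in this sketch.
    """
    seq = seq.upper()
    peptide = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


if __name__ == "__main__":
    # Made-up sequence: the ATG begins at position 3.
    print(translate_from("CCATGGGTGCGAGATAAGG", 3))
```

Calling the function with start positions 1, 2 and 3 on the same sequence gives the three reading phases.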

The beginning of p25 protein, thought to start by
a P-I-V-Q-N-I-Q-G-Q-M-V-H ... aminoacid sequence, is
thought to be coded for by the nucleotidic sequence
CCTATA..., starting at position 656.

Hydrophilic peptides in the gag open reading frame
are identified hereafter. They are defined starting from
aminoacid 1 = Met (M) coded by the ATG starting from 260-2
in the LAV DNA sequence.

Those hydrophilic peptides are:

12-32 aminoacids inclusive
37-46
49-79
88-153
158-165
178-188
200-220
226-234
239-264
268-331
352-361
377-390
399-432
437-484
492-498

The invention also relates to any combination of
these peptides.
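
The aminoacid numbering used for these peptides can be converted back to nucleotide positions with a small calculation. The Python sketch below assumes, as stated in the text, that aminoacid 1 is the methionine coded by the ATG at positions 260-262; the function name `aa_range_to_nt` is hypothetical and chosen for illustration only.

```python
GAG_ATG_START = 260  # first nucleotide of the initiating ATG, per the text


def aa_range_to_nt(first_aa, last_aa, orf_start=GAG_ATG_START):
    """Map an inclusive aminoacid range to inclusive nucleotide positions."""
    nt_first = orf_start + 3 * (first_aa - 1)
    nt_last = orf_start + 3 * last_aa - 1
    return nt_first, nt_last


if __name__ == "__main__":
    print(aa_range_to_nt(12, 32))    # first hydrophilic peptide of the list
    print(aa_range_to_nt(133, 133))  # codon beginning at position 656
```

Under this numbering, aminoacid 133 corresponds to the codon beginning at position 656, which is where the text places the start of the p25 coding sequence (CCTATA...).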

2) The "pol gene" (or ORF-pol)

Figs. 4-12 also show that the DNA fragment
extending from nucleotidic position 1555 (starting with
5' TTT TTT ... 3') to nucleotidic position 5006 is thought
to correspond to the pol gene. The polypeptidic structure
of the corresponding polypeptides is deemed to be that
corresponding to phase 1. It stops at position 4563 (end
by 5' G GAT GAG GAT 3').

These genes are thought to code for the virus
polymerase or reverse transcriptase.

3) The envelope gene (or ORF-env)

The DNA sequence thought to code for envelope
proteins is thought to extend from nucleotidic position
5670 (starting with 5' AAA GAG GAG A... 3') up to
nucleotidic position 8132 (ending by ... A ACT AAA GAA 3').
Polypeptidic structures of sequences of the envelope
protein correspond to those read according to the "phase
3" reading phase.

The start of env transcription is thought to be at
the level of the ATG codon at positions 5691-5693.
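
The reading phases quoted for the gag, pol and env genes can be cross-checked against their start positions. The Python sketch below assumes that phase 1 begins at nucleotide 1, phase 2 at nucleotide 2 and phase 3 at nucleotide 3; this convention is an assumption of the example and is not stated explicitly in the text. The function name `reading_phase` is likewise hypothetical.

```python
def reading_phase(position):
    """Return the reading phase (1, 2 or 3) of a 1-based nucleotide position."""
    return (position - 1) % 3 + 1


# Start positions of the three open reading frames, as given in the text.
ORF_STARTS = {"gag": 236, "pol": 1555, "env": 5670}

if __name__ == "__main__":
    for name, start in ORF_STARTS.items():
        print(name, start, "phase", reading_phase(start))
```

Under this assumption the computed phases are 2 for gag, 1 for pol and 3 for env, in agreement with the phases given in the text for the three genes.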

Additional features of the envelope protein coded
by the env genes appear on figs. 13-18. These are to be
considered as paired figs. 13 and 14; 15 and 16; 17 and
18 respectively.

It is to be mentioned that because of format
difficulties:

Fig. 14 overlaps to some extent with fig. 13.

Fig. 16 overlaps to some extent with fig. 15.

Fig. 18 overlaps to some extent with fig. 17.

Thus for instance figs. 13 and 14 must be considered
together. Particularly the sequence shown on the
first line on the top of fig. 13 overlaps with the
sequence shown on the first line on the top of fig. 14. In
other words the reading of the successive
sequences of the env gene as represented in figs. 13-18
involves first reading the first line at the top of fig.
13, then proceeding further with the first line of fig. 14.
One then returns to the beginning of the second line of
fig. 13, then again proceeds further with the reading of
the second line of fig. 14, etc. The same observations
then apply to the reading of the paired figs. 15 and 16,
and paired figs. 17 and 18, respectively.

The locations of neutralizing epitopes are further
apparent in figs. 13-18. Reference is more particularly
made to the boxed groups of three letters included in the
aminoacid sequences of the envelope proteins (reading
phase 3) which can be designated generally by the formula
N-X-S or N-X-T, wherein X is any other possible aminoacid.
Thus the initial protein product of the env gene is a
glycoprotein of molecular weight in excess of 91,000. These
groups are deemed to generally carry glycosylated groups.
These N-X-S and N-X-T groups with attached glycosylated
groups form together hydrophilic regions of the protein
and are deemed to be located at the periphery of and to be
exposed outwardly with respect to the normal conformation
of the proteins. Consequently they are considered as being
epitopes which can efficiently be brought into play in
vaccine compositions.
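
The search for N-X-S and N-X-T groups in an aminoacid sequence written in one-letter code can be sketched in Python as follows. The function name `find_nxs_nxt` and the test sequence are assumptions made for this example; the sequence is made up and is not copied from the figures. Following the formula given in the text, X is allowed to be any aminoacid; many present-day tools additionally exclude proline at X, which this sketch does not do.

```python
import re

# N-X-S or N-X-T, where X is any aminoacid in one-letter code.
# A lookahead is used so that overlapping groups are all reported.
NXS_NXT = re.compile(r"(?=(N[A-Z][ST]))")


def find_nxs_nxt(protein):
    """Return (1-based position, tripeptide) for each N-X-S or N-X-T group."""
    return [(m.start() + 1, m.group(1)) for m in NXS_NXT.finditer(protein.upper())]


if __name__ == "__main__":
    # Made-up sequence used only to exercise the function.
    for position, group in find_nxs_nxt("GEFFYCNSTQLFNSTW"):
        print(position, group)
```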

The invention thus concerns more particularly
peptide sequences included in the env proteins and
excisable therefrom (or having the same aminoacid
structure), having sizes not exceeding 200 aminoacids.

Preferred peptides of this invention (referred to
hereafter as a, b, c, d, e, f) are deemed to correspond to
those encoded by the nucleotide sequences which extend
respectively between the following positions:

a) from about 6095 to about 6200
b) from about 6260 to about 6310
c) from about 6390 to about 6440
d) from about 6465 to about 6620
e) from about 6860 to about 6930
f) from about 7535 to about 7630
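
The approximate nucleotide limits of peptides a) to f) can be expressed as aminoacid positions of the env open reading frame. The Python sketch below assumes the numbering used elsewhere in the text, in which aminoacid 1 is the lysine coded by the AAA at positions 5670-5672; the names `nt_to_aa` and `peptide_aa_ranges` are hypothetical. Since the limits are only given as "about", the resulting aminoacid positions are approximate as well.

```python
ENV_START = 5670  # first nucleotide of the AAA (lysine) codon taken as aminoacid 1

# Approximate nucleotide limits of peptides a) to f), as listed in the text.
PEPTIDES_NT = {
    "a": (6095, 6200),
    "b": (6260, 6310),
    "c": (6390, 6440),
    "d": (6465, 6620),
    "e": (6860, 6930),
    "f": (7535, 7630),
}


def nt_to_aa(position, orf_start=ENV_START):
    """Return the 1-based index of the aminoacid whose codon contains `position`."""
    return (position - orf_start) // 3 + 1


def peptide_aa_ranges(peptides=PEPTIDES_NT):
    """Convert each nucleotide range into an inclusive aminoacid range."""
    return {name: (nt_to_aa(a), nt_to_aa(b)) for name, (a, b) in peptides.items()}


if __name__ == "__main__":
    for name, (first, last) in peptide_aa_ranges().items():
        print(f"{name}) aminoacids {first}-{last} ({last - first + 1} residues)")
```

With these limits each peptide is well under the 200 aminoacid size mentioned above.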

Other hydrophilic peptides in the env open reading
frame are identified hereafter. They are defined starting
from aminoacid 1 = lysine (K) coded by the AAA at position
5670-2 in the LAV DNA sequence.

These hydrophilic peptides are:

8-23 aminoacids inclusive
63-76
82-90
97-123
127-163
197-201
239-294
300-327
334-381
397-424
466-500
510-523
551-577
594-603
621-630
657-679
719-758
780-803

The invention also relates to any combination of
these peptides.

4) The other ORFs

The invention further concerns DNA sequences which
provide open reading frames defined as ORF-Q, ORF-R and as
"1", "2", "3", "4", "5", the relative position of which
appears more particularly in figs. 2 and 3.

These ORFs have the following locations:

ORF-Q   phase 1   start 4478   stop 5086
ORF-R   phase 2   start 6249   stop 6696
ORF-1   phase 1   start 5029   stop 5316
ORF-2   phase 2   start 5273   stop 5515
ORF-3   phase 1   start 3303   stop 5616
ORF-4   phase 2   start 5519   stop 5773
ORF-5   phase 1   start 7965   stop 8279

The LTR (long terminal repeats) can be defined as lying
between position 8560 and position 160 (end extending
over position 9097/1). As a matter of fact the end of
the genome is at 9097 and, because of the LTR structure of
the retrovirus, links up with the beginning of the
sequence:

           Hind III
    CTCAATAAAGCTTGCCTTG
        9097  1
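
The length of a region which, like the LTR, runs past the last numbered position and continues at position 1 can be computed as in the following Python sketch. The function name `circular_span` is hypothetical, and treating both limits as inclusive is an assumption of the example.

```python
GENOME_LENGTH = 9097  # last nucleotide position of the genome, per the text


def circular_span(start, end, genome_length=GENOME_LENGTH):
    """Length of an inclusive span that may continue past the last position."""
    if end >= start:
        return end - start + 1
    # The span runs from `start` to the last position, then from 1 to `end`.
    return (genome_length - start + 1) + end


if __name__ == "__main__":
    print(circular_span(8560, 160))  # region running over position 9097/1
```

With the limits quoted above (8560 and 160) this gives a span of 698 nucleotides.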

The invention concerns more particularly all the
DNA fragments which have been more specifically referred
to hereabove and which correspond to open reading frames.
It will be understood that the man skilled in the art will
be able to obtain them all, for instance by cleaving an
entire DNA corresponding to the complete genome of a LAV
species, such as by cleavage by a partial or complete
digestion thereof with a suitable restriction enzyme and
by the subsequent recovery of the relevant fragments. The
different DNAs disclosed in the earlier mentioned British
Application can be resorted to also as a source of
suitable fragments. The techniques disclosed hereabove for
the isolation of the fragments which were then included in
the plasmids referred to hereabove and which were then
used for the DNA sequencing can be used.

Of course other methods can be used. Some of them
have been exemplified in the earlier British Application.
Reference is for instance made to the following methods.

a) DNA can be transfected into mammalian cells
with appropriate selection markers by a variety of
techniques: calcium phosphate precipitation, polyethylene
glycol, protoplast fusion, etc.

b) DNA fragments corresponding to genes can be
cloned into expression vectors for E. coli, yeast or
mammalian cells and the resultant proteins purified.

c) The proviral DNA can be "shot-gunned"
(fragmented) into procaryotic expression vectors to generate
fusion polypeptides. Recombinants producing antigenically
competent fusion proteins can be identified by simply
screening the recombinants with antibodies against LAV
antigens.

The invention also relates more specifically to
cloned probes which can be made starting from any DNA
fragment according to this invention, thus to recombinant
DNAs containing such fragments, particularly any plasmids
amplifiable in procaryotic or eucaryotic cells and
carrying said fragments.

Using the cloned DNA fragments as a molecular
hybridization probe - either by marking with radionucleotides
or with fluorescent reagents - LAV virion RNA may be
detected directly in the blood, body fluids and blood
products (e.g. antihemophilic factors such as
Factor VIII concentrates) and vaccines, i.e. hepatitis B
vaccine. It has already been shown that whole virus can be
detected in culture supernatants of LAV producing cells. A
suitable method for achieving that detection comprises
immobilizing virus onto a support, e.g. nitrocellulose
filters, etc., disrupting the virion and hybridizing with
labelled (radiolabelled or "cold" fluorescent- or
enzyme-labelled) probes. Such an approach has already been
developed for Hepatitis B virus in peripheral blood
(according to SCOTTO J. et al., Hepatology (1983), 3,
379-384).

Probes according to the invention can also be used
for rapid screening of genomic DNA derived from the tissue
of patients with LAV related symptoms, to see if the
proviral DNA or RNA is present in host tissue and other
tissues.

A method which can be used for such screening
comprises the following steps: extraction of DNA from
tissue, restriction enzyme cleavage of said DNA,
electrophoresis of the fragments and Southern blotting of genomic
DNA from tissues, subsequent hybridization with labelled
cloned LAV proviral DNA. Hybridization in situ can also be
used.

Lymphatic fluids and tissues and other non-lymphatic
tissues of humans, primates and other mammalian
species can also be screened to see if other evolutionarily
related retroviruses exist. The methods referred to
hereabove can be used, although hybridization and washings
would be done under non-stringent conditions.

The DNA according to the invention can be used
also for achieving the expression of LAV viral antigens
for diagnostic purposes.

The invention also relates to the polypeptides
themselves which can be expressed by the different DNAs of
the invention, particularly by the ORFs or fragments
thereof, in appropriate hosts, particularly procaryotic or
eucaryotic hosts, after transformation thereof with a
suitable vector previously modified by the corresponding
DNAs.

These polypeptides can be used as diagnostic
tools, particularly for the detection of antibodies in
biological media, particularly in sera or tissues of
persons afflicted with pre-AIDS or AIDS, or simply
carrying antibodies in the absence of any apparent
disorders. Conversely the peptides according to
this invention can be used themselves for the production
of antibodies, preferably monoclonal antibodies specific
for the different peptides respectively. For the production
of hybridomas secreting said monoclonal antibodies
conventional production and screening methods are used.
These monoclonal antibodies, which themselves are part of
the invention, then provide very useful tools for the
identification and even determination of relative
proportions of the different polypeptides or proteins in
biological samples, particularly human samples containing
LAV or related viruses.

Thus all of the above peptides can be used in
diagnostics as sources of immunogens or antigens free of
viral particles, produced using non-permissive systems,
and thus of little or no biohazard risk.

The invention further relates to the hosts (procaryotic
or eucaryotic cells) which are transformed by the
above mentioned recombinants and which are capable of
expressing said DNA fragments.

Finally it also relates to vaccine compositions
whose active principle is to be constituted by any of the
expressed antigens, i.e. whole antigens, fusion polypeptides
or oligopeptides, in association with a suitable
pharmaceutically or physiologically acceptable carrier.

Preferably the active principles to be considered
in that field consist of the peptides containing less than
250 aminoacid units, preferably less than 150, as deducible
from the complete genomes of LAV, and even more preferably
those peptides which contain one or more groups selected
from N-X-S and N-X-T as defined above. Preferred peptides
for use in the production of vaccinating principles are
peptides (a) to (f) as defined above. By way of example
having no limitative character, there may be mentioned
that suitable dosages of the vaccine compositions are
those which enable administration to the host,
particularly human host, ranging from 10 to 500 micrograms
per kg, for instance 50 to 100 micrograms per kg.

For the purpose of clarity figs. 19 to 26 are
added. Reference may be made thereto in case of
difficulties of reading blurred parts of figs. 4 to 12.

Needless to say that figs. 19-25 are merely a
reiteration of the whole DNA sequence of the LAV genome.

Finally the invention also concerns vectors for
the transformation of eucaryotic cells of human origin,
particularly lymphocytes, the polymerases of which are
capable of recognizing the LTRs of LAV. Particularly said
vectors are characterized by the presence of a LAV LTR
therein, said LTR being then active as a promoter enabling
the efficient transcription and translation, in a suitable
host as defined above, of a DNA insert coding for a
determined protein placed under its control.

Needless to say that the invention extends to all
variants of genomes and corresponding DNA fragments (ORFs)
having substantially equivalent properties, all of said
genomes belonging to retroviruses which can be considered
as equivalents of LAV.

1. A DNA fragment of LAV extending from nucleotide
position 236 to nucleotide position 1759.

2. A DNA fragment of LAV extending from nucleotide
position 1555 to nucleotide position 5066.

3. A DNA fragment of LAV extending from nucleotide
position 5670 to nucleotide position 8132.

4. A vector containing a DNA fragment according to
any of claims 1 to 3.

5. Peptide corresponding to any of those encoded
by the nucleotide sequences which extend respectively
between the following positions:

a) from about 6095 to about 6200
b) from about 6260 to about 6310
c) from about 6390 to about 6440
d) from about 6485 to about 6620
e) from about 6660 to about 6930
f) from about 7535 to about 7630


6. Peptide characterized by a sequence of aminoacids
deducible from LAV DNA, the terminal aminoacids of
which extend between the following positions with respect
to the lysine (position 1) coded by the AAA at position
5670-5672 in the LAV DNA:

6-23 aminoacids inclusive
63-78
82-90
97-123
127-163
197-201
239-294
300-327
334-381
397-424
466-500
510-523
551-577
594-603
621-630
657-679
719-758
760-803

or any combination of those peptides.

7. Peptide corresponding to the aminoacid
sequences deducible from LAV DNA and the terminal
aminoacids of which are positioned at the positions
hereafter, counted from the Met at position 1 coded by the
ATG sequence at nucleotide positions 260-2:

12-32 aminoacids inclusive
37-46
49-79
89-153
156-165
178-188
200-220
226-234
239-264
286-331
352-361
377-390
399-432
437-494
492-496

and combinations of said peptides.

8. Diagnostic means containing any of the DNA
fragments of any of claims 1 to 3.

9. Diagnostic means containing any of the peptides
of any of claims 4 to 6.

10. Vaccine compositions containing any of the 
peptides according to any of claims i to 8 in association 
with a pharmaceutical vehicle. 


E ^\0l 0 


35 
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C. .CAGACCAGAGCAAGAAATCoAoLC AGTAGATCC.T AC ACT AC AGCC CT GG A A GC AT C C ACC A AG TC ACC C TA/ 
5290 5300 5310 5320 5330 5350 5350 


,P S L F M N K S L R H L L rf 0 E E A E TATKTS 

ovcfttkalgisygrkkrrorrrpp 

KFVS00KP*A5PMAC R S GOSOEOL 
CCAAGTTTGTTTCACAACAAAAGCCTrAGGCArCTCCTATGGCAGGAAGAAGCCCAGACACCGACGAAGACCTGC 
5510 5520 5430 5440 5450 5460 5470 


STCNATVTNSNSSISSSNNNSNSCV 
V H V . H 0 P IOJ A I A A L VVA I I I A I V V W 
Y1*CNLYK*0*QH***Q * • ♦ 0 ♦ L C 

AGTACATGTAATGCAACCTATACAAATAGCAATAGCAGCATTAGTAGTAGCAATAATAATAGCAATAGTTGTGTG 
5530 5540 5550 5560 5570 5580 5590 


'I ■» a V N • * TNRKS.RRQWQ # E ♦ R R N I 5 

IUkLlDRLIERAEDS.GNESEGE ISA 
♦TG*LI D**KE QKTVANRVKEKYO 
AATAGACAG(.TTAATTGATAGACTAATACAAAGAGCAGAAGACAGTGGCAATGAGAGTGAAGGAGAAATATCAGC 
5650 5660 5670 5680 5690 5700 5710 


Y**$VVL0KNCG5QSIHGYLCGRKQ 
lOOLtCVRKIVCHSLLWGTCVEGSN 
L N I C'S^A T EK L H V T V Y Y G V P V U K E A 
TATTGATGATCTGTAGTGCTACACAAAAATTGTGGCTCACACTCTATTATGGGCTACCTGTCTGGAAGCAACCAA 
5770 5780 5790 5800 5310 5820 5330 


RYl5FGPHHPVV.P0TPTHKK*Yi**H 
GTACLGHTC LCTHRPQP TRS S I CYC 
VHNV wAT .HAC VP.TD.. PN P OE VV LV h 
ACGTACATAATCTTTCGGCC ACACATGCCTGTCTACCCACAGACCCCAACCCACAACAACTAGTATTCCTAAATG 
5670 5300 5910 5920 5930 5940 5950 


C H R I ♦ $"V Y—G* I K *A' * " S H" V * N ' * ' P ' H' S'‘V' U V 

a*cvnof ?:gs«pka .'ick i n p t l c * f 

HEDI ISL rtOOS-LKPC*VKLTPLCVSt 
TCCATCAGCATArAArCACTTTArCCGArCAAACCCTA4AGCCATGTCTAAA4TTAACCCCACTCTCTGTTACTT 
5010 6020 6030 6060 6050 6060 6070 


I P I VVA GK • ' • tf R . K E K * K T A L 5 I S A 3 
y 0 * •RGNODGFRROK K L L F 0 Y Q rt K 

T S l S G E 1 N fl t K G E I K b C Si F IN 1 s] T ! 

A7ACCAATAGIAGTAGCGCGGAAATGATGATCGACAAAGGACAG ATAAAAAACTCCTCTTTCAATA7C ACCACAAC 
6130 6160 6150 6160 6170 6180 6190. 


LI*YO* r M ILP A 1 R ♦ • : J V V 7 POSLMR 

* Y N T U R * * Y—' YOLVV-JKL * H L S H Y T G 

o i i p i o ft [) t] ts. ytl rsc £n t $1 v i t o / 

f TGAtATA ir ACCAAT ACATAATGiTlAC TACCAGCT AT ACGTTGACAAGTTGTAACACCTCaGTCATT ACaCAGCL 


6250 6260 6270 

b?%0 

62 90 

6300 

6310 

L V L 9 F * N V J 

1 

I R 9 

S 1 E 0 

D H V 

0 r. S a 


I 
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iDiFvA'4' /< 

- -16n0V.84-129099 

^SOPKTACTTCrCKKCCFHC 

O ' VSLK LLVPLAIVKSVAFI a 
CACuAACTCACCCTAAAACTCCTTCTACCacttcctattctaaaaactcttgctttcattc 
o 5350 5360 5370 5300 5390 5500 


ATKT-SSP. OSDS SSFSIKAVS 
oRRRPPQGSCTHOVSLSKOYV 9 

soeollkavrlikfuyossk* 

agcgacgaacacctcctcaaggcagtcagactcatcaagtttctctatcaaagcagtaagt 

0 5 A 70 5 A 80 5590 5500 5510 5520 .» 


SMSCVVHSNHRI«EN1KTKK 
I A I V VHS IV I I E Y R K IURO'RK 
♦ 0*LCGPP + 5*NIGKY*OKEK 
TAGCAATAGTTGTGTGGTCCATAGTAATCATACAATATAGGAAAATATTAAGACAAAGAAA 
i 5590 5600 5610 5620 5630 5650 
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RRNI STCGDGGGNGAPCSLG 
GE ISALVERGVEMGHHAPWO 
K E K Y OHL W R W G W K W G T H L L G 1 
aACCACAAATATCAGCACTTGTCGAGATGGCGCTCGAAATGGGGCACCATGCTCCTTCGGA 
; 5710 5720 5730 5750 5750 5760 


"V 


3 

) 


CGPKQPP LYFVHOULKHMIO 
VEGSMHHSILCIRC*SI*YR J 

VdK.EAT’TTLFCASOAKAYDTE 
.TGTGGAAGGAAGCAACCACCACTCTATTTTGTGCATCAGATGCTAAAGCATATGATACAG 

5330 5850 5850 5960 5870 5880 I 


*Yh »M*OKILTCGKHTM*NR > 

■ ' I G K C O R KF*HVEK*HGRTO 

V V L V N V T| E N -F N .1 a K N 0 .1 V E. 0 H 

.tagtattggtaaatgtgacagaa'aattttaacatgtggaaaaatcacatggtagaacaga » 

5950 5960 5970 5980 5990 6000 

) 


Tt.C*FKVH*FG E C V ♦ Y 0_*-* ♦ 

■>lcvslkctolg h A TJ N T* h S SI N ) 

. CAC TCT CTGTTAGTTTAAAGTGCACTGATTTGGGCAAIGLIAC I AATACC AATACTACTA 
6070 6080 6090 6100 6110 6120

> 

SISAOAPEWRCPKNHHFFIM 
0 Y Q HKHKR5GAER'ICIFL*T ) 


TCAAlATCAGCACAAGCATAAGAGGTAAGGTCCAGAAAGAATATCCATTTTTTTATAAAC 

6190 6200 6210 6220 6230 6240


OSLHRPY-0RYPL50FPYIIV > 

SHYTGLSKG1L*ANSHTLLC 
sJv-ITOACPKVSFEPIPIHYCA 

CaGTCATTACACaCGCCTGTCCAAAGGTATCCTTTCAGCCAATTCCCATACATT ATTGTC t 

6310 6320 6330 6340 6350 6360


VOKSAOYNVH M 'TL G 0 ♦ Y 0 L N 





PCWFCOSS TV J w 'I R T f 1 K v- ': 

P 4 f. F A I l K C :j /'« K Tl F fy r, C p C T f i v si 

CCCCCCCTCf.TTTTCCCATTCT AAAATCTA AT A/. T A AG AC GTTC A ATGG A AC ACGACC AT GT AC A a/.T GT C ACt 
6370 6380 6390 6400 6410 6420 6430


CC*.HAV*OKK R **LOLP I S 0 T N L K I 
A V e. V a SSRRRGSN*IC 0 F H R 0 C ♦ N 
L U b G SI LAEEEVVIR5A IN F _JJ 0 N A K T 
ToCTCTTGAATGGCACT CT ACCAGAAC AACAGGT AC t AATTAGATCTCCC AATTTCACAGACAATCCTAAAACC 
6490 6500 6510 6520 6530 6540 6550


PrT'IOEKVSVS*GOOGCHLLO*EK« 
0 0 3 V _ K KKYPYPEGTRSS ICYNPKNi 

N Ifi h ~n RKSIRIORGPGRAFVTIGKl 
CCAACAACAATACAACAAAAAGTATCCGTATCCAGAGGGGACCAGGGAGAGCATTTGTTACAATACGAAAAATAI 
6610 6620 6630 6640 6650 6660 6670 


_E_I_ * _N_ -•_» L. A M ♦ _E. N 1. ..E.....L. I. .K....0 _♦_ 5 L S W 

CHFKT0S6QIKRTIMK' ♦ ♦ W N N' L ♦ A 

-A Tl LKOIASKLREOFCN IN K I-1 I I" F K 0 
ATGCCACTTTAAAACAGATAGCTAGCAAATTAAGAGAACAATTTGGAAATAATAAAACAATAATCTTTAAGCAA 
6730 6740 6750 6760 6770 6780 6790 


I GNFSTVIOH NCL IVLCLIVUCYLH 1 
K G I F L L » F -N T T V » » Y L V » » - Y L E Y * 

G E F F Y C /N S_Tj 0 L F |N $ T) U F fN S Tl K 5 T E 

CAGGGG AATTTTTCT ACTGT AATTCAACACAACTGjTTTAAT ACT ACTTCGjTTT'AATACTACTTCGjAGT ACTGAA 
6850 6860 6870 6880 6890 6900 6910


E*NNL*TCGRK9EK0CHPLPSADKL 
nktiykhvagsrksnvcps north 

IKOFINHHOEVGKAMYAP PISCOl 
GAATAAAACAATTTATAAACATGTGCCAGCAAGTAGGAAAAGCAATGTATGCCCCTCCCATCAGCCCACAAATT 
6970 6980 6990 7000 7010 7020 7030


IFTHGPRSSDLEEEIPCTICEVNY 
* * o o w v rdloturrr‘yegolek«i 

N N N IN G SI EIFRPGGG0HR0NWR5EL 
GTAATAACAACAAIGCGrCCGAGATCTTCAGACCTGGAGGAGGAGATATGAGGGACAATTGGAGAAGTGAATTA' 
7090 7100 7110 7120 7130 7140 7150 


FRORE E WCREKKEOhE*ELCSLGSM 
OGKEKSGAERKKSSGNRSFVPHVLf 
KAKRRV VORE K R AVG I G ALFL GFL 
CCAAGCCAAAGAGAAGAGTGGTGCAGAGAGAAAAAAGAGCAGTGGGAATAGCAGCTTTCTTCCTTGGGTTCTTGt 
7210 7220 7230 7240 7250 7260 7270 


YRP0N1f CLV*CSSRTIC*GLLRRNS 
TGOTIIVHYSAAAE OFAEGYPGAT/ 
OAROLLSGI YOQ-ONNLLR .AIEAOQ 
TACAGGCCAGACAATTATTGTCTGGTATAGTGCAGC AGCAGAACAATTTGCT GAGGGCT ATT GAGGCGCA AC ACC 
7330 7340 7350 7360 7370 7380 7390 


es *LW*OT*RI •VSSWGFGVALENSF 
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' T p r < ■ c o h S t i r r * n * a S r . i n s ^ 

•_ T J G P C T h V si TVOCTHf. IR»VVSToL 

. A AC ACG A CC ATGT AC A A A T 0 T C A GC AC AG T AC A AT OT AC AC A T GG A A T T AGGCC AC T AGrATCAACTCAAC 

'.() 6620 f> A TO 6660 6660 ' 6660 66 70 1.6*0 

P I SO TXLKP**rS*rNL*KL I voo 
0 FH ROC*NHNSTA F P t CRN ♦ L V K T 

■ IN F D N A K T I I V 0 L ft. 0 -SI V E I fN C £ R P 

IC AArTTCACACACAATCCTAAAACCATAATACTACACCTCAACCAATCTCT ACAAATTAAT TGTACAACAC 
■0 6569 6550 6560 6570 5530 - 6590 6600 


F M L L 0 ♦ E K ♦ E I * 0 K H I VTL Vf ONC 

S ICYNPKNRKYE TSTL ♦ H ♦ ♦ S K N E 

AFVTICK tGNNRQAHC If. 1 S| R A X M 
.AGCATTTGTTACAATAGGAAAAATAGGAAATATGAGACAAGCACATTCTAACATTAGTACAGCAAAATGGA 
>0 6660 6670 6640 6690 6 700 6710 6720 



I . I. .K_0_*_ S L 5 N P 0 E _ C_._t._0 - K.L... . ♦_.. B_I_V _ L ! » 

» » n :, , n'l*ailrrgprncnaof*lw 

in * iifkossggopeivthsfncg 
.taataaaacaataatctttaagcaatcctcaggaggggacccagaaattgt aacgcacagtttta attgtg 


6780 6790 6800 6810 6820 6830 6840

V L 

G V L K 

G 0 

ITLKFVT05HS 

H A 

* . Y L 

f Y * R 

V K 

♦ HARK ♦H NH TP 

*1 C 


F >N S n W 5 T E G S IN N Tl EGSOTt T L P C R 
t T TAATAG TACTTGGAGT ACTG AAGGGTC A AaT A ACACTG AAGGA AGTGAC AC A ATCACACTCCC ATGCA 
<)' 6900 ' 6910 6920 6930 6960 6950 6960 


mPLPSADKLOVHOILOG C Y * 0 E H V 

C P S H 0 R T H ♦ .1 F I K Y Y_ RAAINKRWN 

APPISGOIRCSS IN I tJ GLLLTRDG C 
T CCCCCTCCCATCAGCGGACAAATTAGATGTTCATCA A AT ATT ACAGGGCTGCTATTAACAAGAGATGGT.G 
^ 7020 7030 7060 7050 7060 7070 7080 


*GTIGEVNYIN1KA*KLNH*E*HP 

E G 0 L E K * l I *l*SSKN*T IRSSTH 
ronwrselykykvvkieplgwapt 
CAGGGACAATTGGAGAAGTGAATTATATAAATATAAAGTAGTAAAAATTGAACCATT AGGAGT agcaccc a 
•) 7160 7150 7160 7170 7180 7190 7200 


* ELCSLGSWEOaEALRAHCOPRAR 
RSFVPWVLGSSRKHYCRTVNDAOG 

r, alflgflg aagstoga r s ntl t v- 

acgagctttgttccttgggttcttggcagcagcaggaagcactatgggcgcacggtcaatgacgctgacgg 
0 7260 7270 7230 7290 7300 7310 7320 


C*GLLRRNSICCNSOSGA$SSSRO 

AFGYAGATASVATHSLGHOAAPGK 

LRAIEAOOHLLOLTVRGIKOLOAR 

-.ctgacggctattcacccccaacagcatctgttgcaactcacactctcgcccatcaagcagctccagccaa 


7380 7390 7400 7410 7420 7430 7440

cvalensf 

A p 

L L C L C N L 

V G V 

t N L 


I 




NPCCGKIPKGSTAPGOLGLLBKTh 
ILAVEBYLK 00OUL C I w GC5 GKLI 
GAATCCTGGCTCTCGAAAGATaCCTAAAGGATCAACAGCTCCTCCGGATTTGGGGTTGCTCTGGAAAACTCATI 
7450 7460 7470 7480 7490 7500 7510


’ WNRFGIT*PG KSGTEKL TI T Q A * Y ] 
GTOLEAHOLDCVG ORK* 0 L H K L N T 
E a I M N Iff" N~ A M N E M 0 8 E I M IN T n S L I H 

TGGAACAGATTTGGAATAACATGACCTGGATGGAGTGGGACAGAGAAATTAACAATTACACAAGCTTAATACA1 
7570 7580 7590 7600 7610 7620 7630 


NYrtN*INGOVCSlGLT*atGCCI*K 
IIGIRAMGKFVELV* H U K L A V V Y K 
LLELOKWASLWNkF I Ti !< ' W L W Y 1 K 
AATTATTCGAATTAGATAAATCGGCAAGTTTGTCGAATTGGTTTAACATAACAAATTGGCTGTGCTATATAAAA 
7690 7700 7710 7720 7730 7740 7750


l-LYFL**IELG80IHHYRFRPTSOP 
CCTFYSE*S*AGIFT1 I V S 0 P P P N 

avlsjv/nrvrogysplsfot'hlpt 

ttgctctactttctatagtgaatagagttacgcacgcatattcaccattatcgtttcagacccacctcccaacc 

7810 7820 7830 7840 7850 7860 7870


»ETETDPFDA*TDP*HLSCT icgal 
E R 0 P 0 I H S I S E B I_ L STYLGRSA EP 
RDRORSIRLV In G $ \ LALIWOOLRSL 
AGAGAGAC AGAGACAGATCCATTCGATTAGTGAACGGATCC 7 TACCACTTATCTCCCACGATCTCC.GGAGCCTT 
7930 7940 7950 7960 7970 7980 7990


TRIVELLGfcRG WEAi.KYWWNLLOYW 
RCLWNFWOAGGGKPSNICGI SYSI 
eocgtsgtogygspo il ve sp t vl 

ACCAGGATTGTGGAACT TCT GGGACGC AGGGGGT GGGAAGCCCTCAAATATTGCTCCAATCTCCTACAGTATTG 
8050 8060 8070 8080 8090 8100 8110


AIA VAEGTORVIEVVOGACRA1RHI 
P*3* LRG0IGL*K*YKELVELFAT 
HS5S*GDR*GYKSSTRSL*SYSPH 

gccatagcagtagctgacgggacagatagggttatagaagtactacaagcaccttgtacacctattcgccacat 

8170 8180 8190 8200 8210 8220 8230


CH0y VKK*CGli( 1AYCKGKNETS5AS 
CCK ‘ < SRSSVVGWPrvSERHRRA EP 
r * 4 sc °KVVrfLOGLL*GKE*OELSO 

GGGTGGCAAGTGGTCAAAAAGTAGTGTCGTTGGATGCCCT ACTCTAAGGGAAAGAATGAGACGAGCTCAGCCAC 
8290 8300 8310 8320 8330 8340 8350


S . H t K < % °» Y t S S Y 0 C C L C L 4 R s T * G C G C 

AATkaaCAKLFAOEEEE 
°SOVAI aOLP«L'LVPG*KHKR B 3 B 


4ccaatcacaag7agcaat4Caccagctaccaatgctgcttgtgcctgcct(aga'agcacFagaScagcaggagc 

8410 8420 8430 8440 3450 '— "■—*54 l u . 


WKTHLH HCCALE C • L E ♦ * I S 

v K ;L I C T T A V P U t% A j| W S TTT k J L
VVYK NIHN DSRRLGRFKN SF
TCTCGTATATAAAAATATTCATAATCATACTAGGAGGCTTCGTACGTTTAAGAATACTTT
7750 7760 7770 7780 7790 7800
CCCACCTCCCAACCCCGAGGGCACCCGACAGGCCCCAACGAATAGAAGAAGAAGGTGGAC
7870 7880 7890 7900 7910 7920
y^TCTGCGGAGCCTTGTGCCTCTTCAGCT ACCACCGCTTGACAGACTTACTCTTGATTGTA
7990 8000 8010 8020 8030 8040

10 20 30 40 50 60

AAGCTTGCCT TGAGTGCTTC AAGTACTGTC TCCCCGTCTC TTGTGTGACT CTGGTAACTA

70 80 90 100 110 120

GAGATCCCTC ACACCCTTTT ACTCAGTCTG GAAAATCTCT AGCAGTGCCG CCCCAACACG 


130 140 150 160 170 180

GACTTGAAAC CGAAAGGGAA ACCAGAGGAG CTCTCTCCAC GCAGGACTCG GCTTGCTCAA


190 200 210 220 230 240 

GCGCGCACGG CAAGAGGCGA GGGGAGGCCA CTCCTGACTA CGCCAAAAAT TTTGACTAGC

250 260 270 280 290 300 

GGAGGCTAGA AGGAGAGAGA TGGGTGCGAG AGCGTCACTA TTAAGCGGGG GAGAATTAGA 

310 320 330 340 350 360 

TCGATGGGAA AAAATTCGGT TAAGGCCAGG GGGAAAGAAA AAATATAAAT TAAAACATAT 

370 380 390 400 410 420 

AGTATGGGCA AGCACGGAGC TAGAACGATT CGCTGTTAAT CCTGGCCTGT TAGAAACATC 


430 440 450 460 470 480

AGAAGGCTGT AGACAAATAC TGGGACAGCT ACAACCATCC CTTCAGACAG GATCAGAACA

490 500 510 520 530 540

ACTTAGATCA TTATATAATA CAGTAGCAAC CCTCTATTCT GTGCATCAAA GGATAGAGAT 

550 560 570 580 590 600

AAAAGACACC AAGGAAGCTT TACACAAGAT AGAGGAAGAG CAAAACAAAA GTAAGAAAAA 

610 620 630 640 650 660 

AGCACAGCAA GCAGCAGCTG ACACAGGACA CAGCAGCCAG GTCAGCCAAA ATTACCCTAT 

670 680 690 700 710 720 

AGTGCAGAAC ATCCAGGGCC AAATGGTACA TCAGGCCATA TCACCTAGAA CTTTAAATCC 

730 740 750 760 770 780 

ATGGGTAAAA GTAGTACAAG AGAAGGCTTT CAGCCCAGAA GTCATACCCA TGTTTTCAGC 

790 800 810 820 830 840 

ATTATCAGAA GGAGCCACCC CACAAGATTT AAACACCATG CTAAACACAG TGCGCGGACA 

850 860 870 880 890 900

TCAAGCACCC ATCCAAATGT TAAAAGAGAC CATCAATCAG GAAGCTGCAG AATGGGATAG 


910 920 930 940 950 960 

AGTGCATCCA GTGCATGCAG GGCCTATTGC ACCAGGCCAG ATCAGAGAAC CAACCGCAAG 

970 980 990 1000 1010 1020

TGACATAGCA GGAACTACTA GTACCCTTCA CGAACAAATA GGATGGATCA CAAATAATCC 

1030 1040 1050 1060 1070 1080 

ACCTATCCCA GTAGGAGAAA TTTATAAAAG ATGGATAATC CTGGGATTAA ATAAAATAGT 

1090 1100 1110 1120 1130 1140

AAGAATGTAT AGCCCTACCA GCATTCTGGA CATAAGACAA GGACCAAAAG AACCCTTTAG

1150 1160 1170 1180 1190 1200

AGACTATCTA CACCGGTTCT ATAAAACTCT AAGAGCCCAG CAAGCTTCAC AGGAGGTAAA

1210 1220 1230 1240 1250 1260

AAATTCCATC ACAGAAACCT TCTTCCTCCA AAATGCGAAC CCAGATTGTA ACACTATTTT

1270 1280 1290 1300 1310 1320

AAAACCATTC CGACCAGCAG CTACACTAGA AGAAATGATG ACAGCATGTC AGGGAGTCGC

1330 1340 1350 1360 1370 1380

AGCACCCCCC CATAAGGCAA CAGTTTTGGC TGAAGCAATG ACCCAAGTAA CAAATTCAGC

1390 1400 1410 1420 1430 1440

TACCATAATC ATGCAAAGAG GCAATTTTAG GAACCAAAGA AAGATTCTTA AGTGTTTCAA

1450 1460 1470 1480 1490 1500

TTGTGCCAAA GAAGGGCACA TACCCAGAAA TTGCAGGGCC CCTAGGAAAA AGGGCTGTTG

1510 1520 1530 1540 1550 1560

GAAATGTGCA AAGGAAGGAC ACCAAATGAA AGATTGTACT GAGAGACAGG CTAATTTTTT

1570 1580 1590 1600 1610 1620

AGGGAAGATC TGCCCTTCCT ACAAGGCAAC GCCAGGGAAT TTTCTTCAGA GCAGACCAGA

1630 1640 1650 1660 1670 1680

GCCAACAGCC CCACCAGAAC AGAGCTTCAG GTCTGGGGTA GAGACAACAA CTCCCTCTCA

1690 1700 1710 1720 1730 1740

GAACCAGGAG CCGATAGACA AGGAACTCTA TCCTTTAACT TCCCTCAGAT CACTCTTTGG

1750 1760 1770 1780 1790 1800

CAACGACCCC TCGTCACAAT AAAGATAGGG GGGCAACTAA AGGAAGCTCT ATTAGATACA

1810 1820 1830 1840 1850 1860

GGAGCAGATG ATACAGTATT AGAAGAAATG AGTTTGCCAG GAAGATGCAA ACCAAAAATG

1870 1880 1890 1900 1910 1920

ATAGGGGGAA TTCGAGGTTT TATCAAAGTA AGACAGTATG ATCAGATACT CATACAAATC

1930 1940 1950 1960 1970 1980

TGTGGACATA AAGCTATAGG TACAGTATTA GTAGGACCTA CACCTGTCAA CATAATTGCA

1990 2000 2010 2020 2030 2040

AGAAATCTGT TGACTCAGAT TCGTTCCACT TTAAATTTTC CCATTAGTCC TATTGAAACT

2050 2060 2070 2080 2090 2100

GTACCAGTAA AATTAAAGCC AGGAATGGAT GGCCCAAAAG TTAAACAATG CCCATTGACA

2110 2120 2130 2140 2150 2160

GAAGAAAAAA TAAAAGCATT AGTAGAAATT TGTACAGAAA TGCAAAAGGA AGGGAAAATT

2170 2180 2190 2200 2210 2220

TCAAAAATTG GCCCTCAAAA TCCATACAAT ACTCCAGTAT TTGCCATAAA GAAAAAAGAC

2230 2240 2250 2260 2270 2280

AGTACTAAAT GCAGAAAATT AGTAGATTTC AGACAACTTA ATAAGAGAAC TCAAGACTTC

2290 2300 2310 2320 2330 2340

TGGGAAGTTC AATTAGGAAT ACCACATCCC GCAGGGTTAA AAAAGAAAAA ATCAGTAACA

2350 2360 2370 2380 2390 2400

UI-CH.U..IU i oGU 1UA T GC ArATTTTttA OTTCCCrTAC ATGAAGACTT CACOAACUr

2410 2420 2430 2440 2450 2460

ACTCCATTTA CCATACCTAG TaTAAACAAT OACACACCAO GCATTACATA TCAGTACAAT 

2470 2480 2490 2500 2510 2520

CTCCTTCCAC AGGGATGGAA AGCATCACCA GCAATATTCC AAAGTAGCAT GACAAAAATC 

2530 2540 2550 2560 2570 2580

TTAGAGCCTT TTaCAAAACA AAATCCAGAC ATAGTTATCT ATCAATACAT GGATGATTTG 

2590 2600 2610 2620 2630 2640

TATGTAGGAT ctgacttaca aatagggcag catagaacaa aaatacagga gctgagacaa 

2650 2660 2670 2680 2690 2700 

CATCTGTTGA GGTGGGGACt TACCACACCA GACAAAAAAC ATCACAAAGA ACCTCCATTC 

2710 2720 2730 2740 2750 2760 

CTTTGGATGG GTTATGAACT CCATCCTGAT AAATGGACAG TACACCCTAT AGTGCTGCCA

2770 2780 2790 2800 2810 2820

GAAAAAGACA GCTCGACTGT CAATGACATA CAGAAGTTAC TGGCAAAATT CAATTGGGCA 

2830 2840 2850 2860 2870 2880 

AGTCAGATTT ACCCAGCGAT TAAAGTAACG CAATTATGTA AACTCCTTAG AGGAACCAAA 

2890 2900 2910 2920 2930 2940

GCACTAACAG aactaatacc ACTAACAGAA GAAGCAGACC tagaactggc agaaaacaga 

2950 2960 2970 2980 2990 3000

GAGATTCTAA AAGAACCAGT ACATGGAGTC TATTATGACC CATCAAAAGA CTTAATAGCA

3010 3020 3030 3040 3050 3060 

GAAATACAGA accacgggca acgccaatgg acatatcaaa tttatcaaga cccatttaaa 

3070 3080 3090 3100 3110 3120

aatctgaaaa caggaaaata tgcaagaacg acgggtgccc acactaatga tgtaaaacaa 

3130 3140 3150 3160 3170 3180 

TTAACAGAGG CAGTGCAAAA AATAACCACA GAAAGCATAG TAATATGGGG AAAGACTCCT 

3190 3200 3210 3220 3230 3240

AAATTTAAAC tacccataca aaaggaaaca tgccaaacat GGTGGACAGA GTATTGGCAA 

3250 3260 3270 3280 3290 3300

CCCACCTCGA TTCCTGAGTG GGAGTTTGTC AATACCCCTC CTTTAGTGAA ATTATGCTAC

3310 3320 3330 3340 3350 3360 

CACTTAGAGA AAGAACCCAT AGTAGCAGCA GAAACGTTCT ATGTAGATGG GGCAGCTAGC 

3370 3380 3390 3400 3410 3420 

ACGGAGACTA AATTAGGAAA AGCAGGATAT CTTACTAATA GAGGAAGACA AAAAGTTGTC 

3430 3440 3450 3460 3470 3480 

ACCCTAACTG ACACAACAAA TCAGAAGACT GAGTTACAAG CAATTCATCT AGCTTTCCAG 

3490 3500 3510 3520 3530 3540 

GATTCGCGAT TAGAAGTAAA TATAGTAACA GACTCACAAT ATGCATTAGG AATCATTCAA

3550 3560 3570 3580 3590 3600

GCACAACCAG ATAAAAGTGA ATCAGAGTTA GTCAATCAAA TAATAGAGCA CTTAATAAAA

3610 3620 3630 3640 3650 3660
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GTAGATAAAT TACTCAGTCC TGCAATCAGG AAAGTACTAT TTTTAGATGG AATAGM- ■) 



3730 3740 3750 3760 3770 3780

CCCCAAGATG AACATGAGAA ATATCACAGT AATTGGAGAG CAATGGCTAG TGATTTTAAC

3790 3800 3810 3820 3830 3840

CTGCCACCTG TAGTAGCAAA AGAAATAGTA GCCAGCTGTC ATAAATGTCA GCTAAAAGGA


3850 3860 3870 3880 3890 3900 

GAAGCCATGC ATGGACAAGT ACACTGTAGT CCAGGAATAT GGCAACTACA TTGTACACAT 


3910 3920 3930 3940 3950 3960 

TTACAACCAA AAGTTATCCT CCTACCAGTT CATCTAGCCA GTGGATATAT ACAAGCACAA 

3970 3980 3990 4000 4010 4020

GTTATTCCAG CAGAAACAGG GCAGGAAACA GCATACTTTC TTTTAAAATT AGCAGCAACA

4030 4040 4050 4060 4070 4080 

TCGCCAGTAA AAACAATACA TACAGACAAT GCCAGCAATT TCACCAGTAC TACCGTTAAG 

4090 4100 4110 4120 4130 4140 

GCCGCCTGTT GGTGGCCGGG AATCAAGCAG GAATTTGGAA TTCCCTACAA TCCCCAAACT 

4150 4160 4170 4180 4190 4200 

CAAGGAGTAG TAGAATCTAT GAATAAAGAA TTAAAGAAAA TTATAGGCCA GCTAACAGAT 

4210 4220 4230 4240 4250 4260 

CAGCCTGAAC ATCTTAAGAC ACCAGTACAA ATGGCACTAT TCATCCACAA TTTTAAAACA 

4270 4280 4290 4300 4310 4320

AAAGGGGGGA TTGGGGGGTA CAGTCCAGGG GAAAGAATAG TACACATAAT AGCAACAGAC 

4330 4340 4350 4360 4370 4380 

ATACAAACTA AAGAATTACA AAAACAAATT ACAAAAATTC AAAATTTTCG GGTTTATTAC 

4390 4400 4410 4420 4430 4440 

AGCCACAGCA GAGATCCACT TTGGAAAGGA CCAGCAAAGC TCCTCTCGAA AGCTCAAGCG 

4450 4460 4470 4480 4490 4500 

GCAGTAGTAA TACAAGATAA TAGTGACATA AAAGTAGTGC CAAGAAGAAA AGCAAACATC 

4510 4520 4530 4540 4550 4560 

attagggatt atgcaaaaca GATGGCAGGT GATCATTCTC TCGCAAGTAG ACAGGATGAG 

4570 4580 4590 4600 4610 4620 

GATTAGAACA TGGAAAAGTT TAGTAAAACA CCATATGTAT GTTTCAGGGA AAGCTAGGGG 

4630 4640 4650 4660 4670 4680 

ATCGTTTTAT AGACATCACT ATGAAACCCC TCATCCAAGA ATAAGTTCAG AAGTACACAT 

4690 4700 4710 4720 4730 4740 

CCCACTAGGG GATGCTAGAT TGGTAATAAC AACATATTGG GCTCTGCATA CAGGAGAAAG 

4750 4760 4770 4780 4790 4800 

AGACTGGCAT CTCGGTCAGG GAGTCTCCAT AGAATGGACC AAAAACAGAT ATACCACACA

4810 4820 4830 4840 4850 4860 

AGTAGACCCT GAACTAGCAG ACCAACTAAT TCATCTGTAT TACTTTGACT GTTTTTCAGA 

4870 4880 4890 4900 4910 4920

4990 5000 5010 5020 5030 5040

aaagataaac ccacctttgc ctagtgttac gaaactgaca gaggatagat GGAACAAGCC 

5050 5060 5070 5080 5090 5100

CCAGAAGACC AAGGGCCACA GAGCGAGCCA CACAATGAAT CGACACTACA GCTTTTaCAG 

5110 5120 5130 5140 5150 5160

GAGCTTAAGA ATGAAGCTGT tagacatttt cctacgattt ggctccatcg cttagggcaa 


5170 5180 5190 5200 5210 5220

catatctatg aaacttatgg ggatacttgg ccacgagtgg aagccataat aagaattctg 


5230 5240 5250 5260 5270 5280

caacaactgc tgtttatcca tttcagaatt gggtgtcgac atagcagaat aggcgttact 

5290 5300 5310 5320 5330 5340

caacagacga caccaagaaa tggagccagt agatcctaga ctagagccct ggaagcatcc 

5350 5360 5370 5380 5390 5400

aggaagtcag cctaaaactg cttgtaccac ttgctattgt aaaaagtgtt gctttcattc 

5410 5420 5430 5440 5450 5460

ccaagtttgt ttcacaacaa aagccttagg catctcctat ggcaggaaga agcggagaca 

5470 5480 5490 5500 5510 5520

GCGACGAAGA CCTCCTCAAG GCAGTCAGAC TCATCAAGTT TCTCTATCAA AGCAGTAAGT 

5530 5540 5550 5560 5570 5580

AGTACATGTA ATGCAACCTA TACAAATACC AATAGCAGCA TTAGTAGTAG CAATAATAAT 

5590 5600 5610 5620 5630 5640

AGCAATAGTT GTGTGGTCCA TACTAATCAT AGAATATAGG AAAATATTAA CACAAAGAAA 

5650 5660 5670 5680 5690 5700 

AATAGACAGG TTAATTGATA GACTAATAGA AAGACCAGAA GACACTGGCA ATGACAGTGA 

5710 5720 5730 5740 5750 5760

AGGACAAATA TCAGCACTTG TGGAGATGGG CCTGGAAATG GGGCACCATC CTCCTTCGGA 

5770 5780 5790 5800 5810 5820

TATTCATCAT CTGTAGTGCT ACAGAAAAAT TGTGGGTCAC AGTCTATTAT CGGGTACCTG

5830 5840 5850 5860 5870 5880

TGTGGAAGGA AGCAACCACC ACTCTATTTT GTGCATCAGA TGCTAAAGCA TATGATACAG 

5890 5900 5910 5920 5930 5940

ACGTACATAA TGTTTGGGCC ACACATGCCT GTGTACCCAC AGACCCCAAC CCACAAGAAG 

5950 5960 5970 5980 5990 6000

TAGTATTCGT AAATCTGACA GAAAATTTTA ACATCTGGAA AAATCACATG CTAGAACACA 

6010 6020 6030 6040 6050 6060

TGCATGACGA TATAATCACT TTATGGGATC AAAGCCTAAA CCCATGTGTA AAATTAACCC 

6070 6080 6090 6100 6110 6120 

CACTCTGTGT TAGTTTAAAG TGCACTGATT TGGCGAATCC TACTAATACC AATAGTAGTA 

6130 6140 6150 6160 6170 6180

AT ACC A AT AG TAC 3CG CAAATCATGA T CGAGAAA( , AGAT AAAA AACTCCTCTT

6190 6200 6210 6220 6230 6240

tcaatatcag CACAACCATA AGAGGTAAGG TGCAGAAACA ATATGCATTT TTTTATAAaC 

6250 6260 6270 6280 6290 6300

TTGATATAAT accaatagat AATGATACTA ccagctatac gttgacaact tgtaacacct 

6310 6320 6330 6340 6350 6360

CAGTCATTAC ACAGGCCTGT CCAAAGGTAT CCTTTGAGCC AATTCCCATA CATTATTCTG 

6370 6380 6390 6400 6410 6420

CCCCGGCTGG TTTTGCGATT CTAAAATGTA ATAATAAGAC GTTCAATCCA ACAGGACCAT 

6430 6440 6450 6460 6470 6480

GTACAAATGT CAGCACAGTA CAATGTACAC ATGGAATTAG GCCAGTAGTA TCAACTCAAC 

6490 6500 6510 6520 6530 6540

TGCTGTTGAA TGGCAGTCTA GCAGAAGAAG AGGTAGTAAT TAGATCTGCC AATTTCACAO 

6550 6560 6570 6580 6590 6600

ACAATGCTAA AACCATAATA gtacagctga accaatctgt ACAAATTAAT TCTACAAGAC 

6610 6620 6630 6690 6650 6660 

CCAACAACAA TACAAGAAAA AGTATCCGTA TCCAGAGGGG ACCAGGGAGA GCATTTGTTA 

6670 6680 6690 6700 6710 6720

CAATAGGAAA AATAGGAAAT ATGAGACAAG CACATTGTAA CATTAGTAGA GCAAAATGGA 

6730 6740 6750 6760 6770 6780

ATGCCACTTT aaaacagata gctagcaaat taagagaaca atttggaaat aataaaacaa 

6790 6800 6810 6820 6830 6840

taatctttaa gcaatcctca ggaggggacc cacaaattgt aacgcacagt tttaattgtg 

6850 6860 6870 6880 6890 6900

GAGGGGAATT TTTCTACTGT AATTCAACAC AACTGTTTAA TAGTACTTGG TTTAATACTA 

6910 6920 6930 6940 6950 6960

CTTGGAGTAC TGAAGGGTCA AATAACACTG AAGGAAGTGA CACAATCACA ctcccatgca 

6970 6980 6990 7000 7010 7020 

GAATAAAACA ATTTATAAAC ATGTGCCAGG AAGTACCAAA ACCAATGTAT GCCCCTCCCA 

7030 7040 7050 7060 7070 7080

TCAGCGGACA AATTAGATGT TCATCAAATA TTACAGGGCT GCTATTAACA AGAGATGGTG 

7090 7100 7110 7120 7130 7140

GTAATAACAA CAATGGGTCC GAGATCTTCA GACCTGGAGG AGGAGATATC AGGGACAATT 

7150 7160 7170 7180 7190 7200 

GCAGAAGTGA ATTATATAAA TATAAAGTAG TAAAAATTGA ACCATTACGA GTAGCACCCA 

7210 7220 7230 7240 7250 7260

CCAAGGCAAA CAGAACAGTG GTGCACAGAG AAAAAACAGC AGTGCGAATA GGAGCTTTGT 

7270 7280 7290 7300 7310 7320 

TCCTTGGGTT CTTGGGAGCA GCAGGAAGCA CTATGGGCGC ACGGTCAATG ACGCTGACGC 

7330 7340 7350 7360 7370 7380

TACAGCCCAG ACAATTATTG TCTCGTATAG TGCAGCAGCA GAACAATTTG CTGAGGGCTA

7390 7400 7410 7420 7430 7440

.AGGCGCA ACAOCATC. G C •• AC T C A CAGTCTGGGG CAK . CAG CTCCAGCCAA

7450 7460 7470 7480 7490 7500

GaATCCTCCC TGTCCAAACA TACCUAAGG ATCAACAGCt CCTCGGCATT TGGCCTTCCT


7510 7520 7530 7540 7550 7560

CTGGAAAACT cautgcacc actcctgtgc cttggaatgc tacttggagt aataaatctc 


7570 7580 7590 7600 7610 7620 

TGGAACAGAT TTCGAATAAC ATGACCTCGA TGGAGTGGGA CAGACAAATT AACAATTACA

7630 7640 7650 7660 7670 7680

caagcttaat acattcctta attgaacaat CGCAAAACCA GCAAGAAAAG AATGAACAAC 



7690 7700 7710 7720 7730 7740 

AATTATTGGA attagataaa tgggcaagtt tgtcgaattg gtttaacata acaaattggc 

7750 7760 7770 7780 7790 7800 

tgtggtatat aaaaatattc ataatcatag taggaggctt ggtaggttta agaatagttt 

7810 7820 7830 7840 7850 7860

ttgctgtact ttctatactg aatacagtta gccaccgata ttcaccatta tcgtttcaga 


7870 7880 7890 7900 7910 7920 

CCCACCTCCC aaccccgagc ggacccgaca ggcccgaagg aatagaagaa gaaggtccac 

7930 7940 7950 7960 7970 7980 

AGAGAGACAG AGACAGATCC ATTCGATTAG TGAACGGATC CTTAGCACTT ATCTGGGACG


7990 8000 8010 8020 8030 8040

ATCTCCGGAC CCTTCTGCCT CTTCACCTAC CACCGCTTGA GAGACTTACT CTTGATTGTA 

8050 8060 8070 8080 8090 8100 

ACGACCATTG TGGAACTTCT GGGACGCAGG GGGTGGGAAG CCCTCAAATA TTGGTGCAAT 


8110 8120 8130 8140 8150 8160 

CTCCTACAGT attggagtca ggaactaaag AATAGTGCTG TTAGCTTGCT CAATGCCACA 

8170 8180 8190 8200 8210 8220 

CCCATAGCAG TAGCTGAGGG GACAGATAGC GTTATAGAAG TAGTACAAGC ACCTTGTACA 

8230 8240 8250 8260 8270 8280 

GCTATTCCCC ACATACCTAG AAGAATAAGA CAGGCCTTGG AAAGCATTTT CCTATAAGAT

8290 8300 8310 8320 8330 8340 

GGGTCGCAAG TGGTCAAAAA CTAGTCTGCT TGGATCCCCT ACTCTAAGGG AAAGAATGAG 

8350 8360 8370 8380 8390 8400

ACGAGCTGAG CCACCAGCAG ATGGGGTGGG AGCAGCATCT CGAGACCTGG AAAAACATGG 

8410 8420 8430 8440 8450 8460

ACCAATCACA AGTAGCAATA CAGCAGCTAC CAATGCTGCT TGTGCCTGGC TAGAAGCACA 

8470 8480 8490 8500 8510 8520 

AGAGCAGGAG GAGGTGGGTT TTCCACTCAC ACCTCACGTA CCTTTAAGAC CAATCACTTA 

8530 8540 8550 8560 8570 8580 

CAAGGCAGCT CTAGATCTTA CCCACTTTTT AAAACAAAAG GGCGGACTGG AAGGGCTAAT 

8590 8600 8610 8620 8630 8640 

TCACTCCCAA CGAAGACAAG ATATCCTTGA TCTGTGGATC TACCACACAC AAGCCTACTT

8650 8660 8670 8680 8690 8700 








